Microphone array system

ABSTRACT

A method and system for enhancing a target sound signal from multiple sound signals is provided. An array of an arbitrary number of sound sensors positioned in an arbitrary configuration receives the sound signals from multiple disparate sources. The sound signals comprise the target sound signal from a target sound source, and ambient noise signals. A sound source localization unit, an adaptive beamforming unit, and a noise reduction unit are in operative communication with the array of sound sensors. The sound source localization unit estimates a spatial location of the target sound signal from the received sound signals. The adaptive beamforming unit performs adaptive beamforming by steering a directivity pattern of the array of sound sensors in a direction of the spatial location of the target sound signal, thereby enhancing the target sound signal and partially suppressing the ambient noise signals, which are further suppressed by the noise reduction unit.

CROSS REFERENCE TO RELATED APPLICATIONS

This application is a reissue application of U.S. patent application Ser. No. 13/049,877, filed Mar. 16, 2011 (now U.S. Pat. No. 8,861,756), which claims the benefit of provisional patent application No. 61/403,952 titled “Microphone array design and implementation for telecommunications and handheld devices”, filed on Sep. 24, 2010 in the United States Patent and Trademark Office.

The specification of the above referenced patent application is incorporated herein by reference in its entirety.

BACKGROUND

Microphones constitute an important element in today's speech acquisition devices. Currently, most of the hands-free speech acquisition devices, for example, mobile devices, lapels, headsets, etc., convert sound into electrical signals by using a microphone embedded within the speech acquisition device. However, the paradigm of a single microphone often does not work effectively because the microphone picks up many ambient noise signals in addition to the desired sound, specifically when the distance between a user and the microphone is more than a few inches. Therefore, there is a need for a microphone system that operates under a variety of different ambient noise conditions and that places fewer constraints on the user with respect to the microphone, thereby eliminating the need to wear the microphone or be in close proximity to the microphone.

To mitigate the drawbacks of the single microphone system, there is a need for a microphone array that achieves directional gain in a preferred spatial direction while suppressing ambient noise from other directions. Conventional microphone arrays include arrays that are typically developed for applications such as radar and sonar, but are generally not suitable for hands-free or handheld speech acquisition devices. The main reason is that the desired sound signal has an extremely wide bandwidth relative to its center frequency, thereby rendering conventional narrowband techniques employed in the conventional microphone arrays unsuitable. In order to cater to such broadband speech applications, the array size needs to be vastly increased, making the conventional microphone arrays large and bulky, and precluding the conventional microphone arrays from having broader applications, for example, in mobile and handheld communication devices. There is a need for a microphone array system that provides an effective response over a wide spectrum of frequencies while being unobtrusive in terms of size.

Hence, there is a long felt but unresolved need for a broadband microphone array and broadband beamforming system that enhances the acoustics of a desired sound signal while suppressing ambient noise signals.

SUMMARY OF THE INVENTION

This summary is provided to introduce a selection of concepts in a simplified form that are further described in the detailed description of the invention. This summary is not intended to identify key or essential inventive concepts of the claimed subject matter, nor is it intended for determining the scope of the claimed subject matter.

The method and system disclosed herein address the above stated need for enhancing the acoustics of a target sound signal received from a target sound source, while suppressing ambient noise signals. As used herein, the term “target sound signal” refers to a sound signal from a desired or target sound source, for example, a person's speech that needs to be enhanced. A microphone array system comprising an array of sound sensors positioned in an arbitrary configuration, a sound source localization unit, an adaptive beamforming unit, and a noise reduction unit is provided. The sound source localization unit, the adaptive beamforming unit, and the noise reduction unit are in operative communication with the array of sound sensors. The array of sound sensors is, for example, a linear array of sound sensors, a circular array of sound sensors, or an arbitrarily distributed coplanar array of sound sensors. The array of sound sensors, herein referred to as a “microphone array”, receives sound signals from multiple disparate sound sources. The method disclosed herein can be applied to a microphone array with an arbitrary number of sound sensors having, for example, an arbitrary two dimensional (2D) configuration. The sound signals received by the sound sensors in the microphone array comprise the target sound signal from the target sound source among the disparate sound sources, and ambient noise signals.

The sound source localization unit estimates a spatial location of the target sound signal from the received sound signals, for example, using a steered response power-phase transform. The adaptive beamforming unit performs adaptive beamforming for steering a directivity pattern of the microphone array in a direction of the spatial location of the target sound signal. The adaptive beamforming unit thereby enhances the target sound signal from the target sound source and partially suppresses the ambient noise signals. The noise reduction unit suppresses the ambient noise signals for further enhancing the target sound signal received from the target sound source.

In an embodiment where the target sound source that emits the target sound signal is in a two dimensional plane, a delay between each of the sound sensors and an origin of the microphone array is determined as a function of the distance between each of the sound sensors and the origin, a predefined angle between each of the sound sensors and a reference axis, and an azimuth angle between the reference axis and the target sound signal. In another embodiment where the target sound source that emits the target sound signal is in a three dimensional plane, the delay between each of the sound sensors and the origin of the microphone array is determined as a function of the distance between each of the sound sensors and the origin, a predefined angle between each of the sound sensors and a first reference axis, an elevation angle between a second reference axis and the target sound signal, and an azimuth angle between the first reference axis and the target sound signal. This method of determining the delay enables beamforming for arbitrary numbers of sound sensors and multiple arbitrary microphone array configurations. The delay is determined, for example, in terms of the number of samples. Once the delay is determined, the microphone array can be aligned to enhance the target sound signal from a specific direction.

The adaptive beamforming unit comprises a fixed beamformer, a blocking matrix, and an adaptive filter. The fixed beamformer steers the directivity pattern of the microphone array in the direction of the spatial location of the target sound signal from the target sound source for enhancing the target sound signal, when the target sound source is in motion. The blocking matrix feeds the ambient noise signals to the adaptive filter by blocking the target sound signal from the target sound source. The adaptive filter adaptively filters the ambient noise signals in response to detecting the presence or absence of the target sound signal in the sound signals received from the disparate sound sources. The fixed beamformer performs fixed beamforming, for example, by filtering and summing output sound signals from the sound sensors.

In an embodiment, the adaptive filtering comprises sub-band adaptive filtering. The adaptive filter comprises an analysis filter bank, an adaptive filter matrix, and a synthesis filter bank. The analysis filter bank splits the enhanced target sound signal from the fixed beamformer and the ambient noise signals from the blocking matrix into multiple frequency sub-bands. The adaptive filter matrix adaptively filters the ambient noise signals in each of the frequency sub-bands in response to detecting the presence or absence of the target sound signal in the sound signals received from the disparate sound sources. The synthesis filter bank synthesizes a full-band sound signal using the frequency sub-bands of the enhanced target sound signal. In an embodiment, the adaptive beamforming unit further comprises an adaptation control unit for detecting the presence of the target sound signal and adjusting a step size for the adaptive filtering in response to detecting the presence or the absence of the target sound signal in the sound signals received from the disparate sound sources.

The noise reduction unit suppresses the ambient noise signals for further enhancing the target sound signal from the target sound source. The noise reduction unit performs noise reduction, for example, by using a Wiener-filter based noise reduction algorithm, a spectral subtraction noise reduction algorithm, an auditory transform based noise reduction algorithm, or a model based noise reduction algorithm. The noise reduction unit performs noise reduction in the multiple frequency sub-bands employed for sub-band adaptive beamforming by the analysis filter bank of the adaptive beamforming unit.

The microphone array system disclosed herein, comprising the microphone array with an arbitrary number of sound sensors positioned in arbitrary configurations, can be implemented in handheld devices, for example, the iPad® of Apple Inc., the iPhone® of Apple Inc., smart phones, tablet computers, laptop computers, etc. The microphone array system disclosed herein can further be implemented in conference phones, video conferencing applications, or any device or equipment that needs better speech inputs.

BRIEF DESCRIPTION OF THE DRAWINGS

The foregoing summary, as well as the following detailed description of the invention, is better understood when read in conjunction with the appended drawings. For the purpose of illustrating the invention, exemplary constructions of the invention are shown in the drawings. However, the invention is not limited to the specific methods and instrumentalities disclosed herein.

FIG. 1 illustrates a method for enhancing a target sound signal from multiple sound signals.

FIG. 2 illustrates a system for enhancing a target sound signal from multiple sound signals.

FIG. 3 exemplarily illustrates a microphone array configuration showing a microphone array having N sound sensors arbitrarily distributed on a circle.

FIG. 4 exemplarily illustrates a graphical representation of a filter-and-sum beamforming algorithm for determining the output of the microphone array having N sound sensors.

FIG. 5 exemplarily illustrates distances between an origin of the microphone array and sound sensor M₁ and sound sensor M₃ in the circular microphone array configuration, when the target sound signal is at an angle θ from the Y-axis.

FIG. 6A exemplarily illustrates a table showing the distance between each sound sensor in a circular microphone array configuration and the origin of the microphone array, when the target sound source is in the same plane as that of the microphone array.

FIG. 6B exemplarily illustrates a table showing the relationship between the position of each sound sensor in the circular microphone array configuration and its distance to the origin of the microphone array, when the target sound source is in the same plane as that of the microphone array.

FIG. 7A exemplarily illustrates a graphical representation of a microphone array, when the target sound source is in a three dimensional plane.

FIG. 7B exemplarily illustrates a table showing the delay between each sound sensor in a circular microphone array configuration and the origin of the microphone array, when the target sound source is in a three dimensional plane.

FIG. 7C exemplarily illustrates a three dimensional working space of the microphone array, where the target sound signal is incident at an elevation angle Ψ<Ω.

FIG. 8 exemplarily illustrates a method for estimating a spatial location of the target sound signal from the target sound source by a sound source localization unit using a steered response power-phase transform.

FIG. 9A exemplarily illustrates a graph showing the value of the steered response power-phase transform for every 10°.

FIG. 9B exemplarily illustrates a graph representing the estimated target sound signal from the target sound source.

FIG. 10 exemplarily illustrates a system for performing adaptive beamforming by an adaptive beamforming unit.

FIG. 11 exemplarily illustrates a system for sub-band adaptive filtering.

FIG. 12 exemplarily illustrates a graphical representation showing the performance of a perfect reconstruction filter bank.

FIG. 13 exemplarily illustrates a block diagram of a noise reduction unit that performs noise reduction using a Wiener-filter based noise reduction algorithm.

FIG. 14 exemplarily illustrates a hardware implementation of the microphone array system.

FIGS. 15A-15C exemplarily illustrate a conference phone comprising an eight-sensor microphone array.

FIG. 16A exemplarily illustrates a layout of an eight-sensor microphone array for a conference phone.

FIG. 16B exemplarily illustrates a graphical representation of eight spatial regions to which the eight-sensor microphone array of FIG. 16A responds.

FIGS. 16C-16D exemplarily illustrate computer simulations showing the steering of the directivity patterns of the eight-sensor microphone array of FIG. 16A in the directions of 15° and 60° respectively, in the frequency range 300 Hz to 5 kHz.

FIGS. 16E-16L exemplarily illustrate graphical representations showing the directivity patterns of the eight-sensor microphone array of FIG. 16A in each of the eight spatial regions, where each directivity pattern is an average response from 300 Hz to 5000 Hz.

FIG. 17A exemplarily illustrates a graphical representation of four spatial regions to which a four-sensor microphone array for a wireless handheld device responds.

FIGS. 17B-17I exemplarily illustrate computer simulations showing the directivity patterns of the four-sensor microphone array of FIG. 17A with respect to azimuth and frequency.

FIGS. 18A-18B exemplarily illustrate a microphone array configuration for a tablet computer.

FIG. 18C exemplarily illustrates an acoustic beam formed using the microphone array configuration of FIGS. 18A-18B according to the method and system disclosed herein.

FIGS. 18D-18G exemplarily illustrate graphs showing processing results of the adaptive beamforming unit and the noise reduction unit for the microphone array configuration of FIG. 18B, in both a time domain and a spectral domain for the tablet computer.

FIGS. 19A-19F exemplarily illustrate tables showing different microphone array configurations and the corresponding values of delay τ_(n) for the sound sensors in each of the microphone array configurations.

DETAILED DESCRIPTION OF THE INVENTION

FIG. 1 illustrates a method for enhancing a target sound signal from multiple sound signals. As used herein, the term “target sound signal” refers to a desired sound signal from a desired or target sound source, for example, a person's speech that needs to be enhanced. The method disclosed herein provides 101 a microphone array system comprising an array of sound sensors positioned in an arbitrary configuration, a sound source localization unit, an adaptive beamforming unit, and a noise reduction unit. The sound source localization unit, the adaptive beamforming unit, and the noise reduction unit are in operative communication with the array of sound sensors. The microphone array system disclosed herein employs the array of sound sensors positioned in an arbitrary configuration, the sound source localization unit, the adaptive beamforming unit, and the noise reduction unit for enhancing a target sound signal by acoustic beamforming in the direction of the target sound signal in the presence of ambient noise signals.

The array of sound sensors, herein referred to as a “microphone array”, comprises multiple or an arbitrary number of sound sensors, for example, microphones, operating in tandem. The microphone array refers to an array of an arbitrary number of sound sensors positioned in an arbitrary configuration. The sound sensors are transducers that detect sound and convert the sound into electrical signals. The sound sensors are, for example, condenser microphones, piezoelectric microphones, etc.

The sound sensors receive 102 sound signals from multiple disparate sound sources and directions. The target sound source that emits the target sound signal is one of the disparate sound sources. As used herein, the term “sound signals” refers to composite sound energy from multiple disparate sound sources in an environment of the microphone array. The sound signals comprise the target sound signal from the target sound source and the ambient noise signals. The sound sensors are positioned in an arbitrary planar configuration, herein referred to as a “microphone array configuration”, for example, a linear configuration, a circular configuration, any arbitrarily distributed coplanar array configuration, etc. By employing beamforming according to the method disclosed herein, the microphone array provides a higher response to the target sound signal received from a particular direction than to the sound signals from other directions. A plot of the response of the microphone array versus frequency and direction of arrival of the sound signals is referred to as a directivity pattern of the microphone array.

The sound source localization unit estimates 103 a spatial location of the target sound signal from the received sound signals. In an embodiment, the sound source localization unit estimates the spatial location of the target sound signal from the target sound source, for example, using a steered response power-phase transform as disclosed in the detailed description of FIG. 8.

The adaptive beamforming unit performs adaptive beamforming 104 by steering the directivity pattern of the microphone array in a direction of the spatial location of the target sound signal, thereby enhancing the target sound signal and partially suppressing the ambient noise signals. Beamforming refers to a signal processing technique used in the microphone array for directional signal reception, that is, spatial filtering. This spatial filtering is achieved by using adaptive or fixed methods. Spatial filtering refers to separating two signals with overlapping frequency content that originate from different spatial locations.

The noise reduction unit performs noise reduction by further suppressing 105 the ambient noise signals, thereby further enhancing the target sound signal. The noise reduction unit performs the noise reduction, for example, by using a Wiener-filter based noise reduction algorithm, a spectral subtraction noise reduction algorithm, an auditory transform based noise reduction algorithm, or a model based noise reduction algorithm.

FIG. 2 illustrates a system 200 for enhancing a target sound signal from multiple sound signals. The system 200, herein referred to as a “microphone array system”, comprises the array 201 of sound sensors positioned in an arbitrary configuration, the sound source localization unit 202, the adaptive beamforming unit 203, and the noise reduction unit 207.

The array 201 of sound sensors, herein referred to as the “microphone array”, is in operative communication with the sound source localization unit 202, the adaptive beamforming unit 203, and the noise reduction unit 207. The microphone array 201 is, for example, a linear array of sound sensors, a circular array of sound sensors, or an arbitrarily distributed coplanar array of sound sensors. The microphone array 201 achieves directional gain in any preferred spatial direction and frequency band while suppressing signals from other spatial directions and frequency bands. The sound sensors receive the sound signals comprising the target sound signal and ambient noise signals from multiple disparate sound sources, where one of the disparate sound sources is the target sound source that emits the target sound signal.

The sound source localization unit 202 estimates the spatial location of the target sound signal from the received sound signals. In an embodiment, the sound source localization unit 202 uses, for example, a steered response power-phase transform for estimating the spatial location of the target sound signal from the target sound source.

The adaptive beamforming unit 203 steers the directivity pattern of the microphone array 201 in a direction of the spatial location of the target sound signal, thereby enhancing the target sound signal and partially suppressing the ambient noise signals. The adaptive beamforming unit 203 comprises a fixed beamformer 204, a blocking matrix 205, and an adaptive filter 206 as disclosed in the detailed description of FIG. 10. The fixed beamformer 204 performs fixed beamforming by filtering and summing output sound signals from each of the sound sensors in the microphone array 201 as disclosed in the detailed description of FIG. 4. In an embodiment, the adaptive filter 206 is implemented as a set of sub-band adaptive filters. The adaptive filter 206 comprises an analysis filter bank 206a, an adaptive filter matrix 206b, and a synthesis filter bank 206c as disclosed in the detailed description of FIG. 11.

The noise reduction unit 207 further suppresses the ambient noise signals for further enhancing the target sound signal. The noise reduction unit 207 is, for example, a Wiener-filter based noise reduction unit, a spectral subtraction noise reduction unit, an auditory transform based noise reduction unit, or a model based noise reduction unit.

FIG. 3 exemplarily illustrates a microphone array configuration showing a microphone array 201 having N sound sensors 301 arbitrarily distributed on a circle 302 with a diameter “d”, where “N” refers to the number of sound sensors 301 in the microphone array 201. Consider an example where N=4, that is, there are four sound sensors 301 M₀, M₁, M₂, and M₃ in the microphone array 201. Each of the sound sensors 301 is positioned at an acute angle “Φ_(n)” from a Y-axis, where Φ_(n)≥0 and n=0, 1, 2, . . . , N−1. In an example, the sound sensor 301 M₀ is positioned at an acute angle Φ₀ from the Y-axis; the sound sensor 301 M₁ is positioned at an acute angle Φ₁ from the Y-axis; the sound sensor 301 M₂ is positioned at an acute angle Φ₂ from the Y-axis; and the sound sensor 301 M₃ is positioned at an acute angle Φ₃ from the Y-axis. A filter-and-sum beamforming algorithm determines the output “y” of the microphone array 201 having N sound sensors 301 as disclosed in the detailed description of FIG. 4.

FIG. 4 exemplarily illustrates a graphical representation of the filter-and-sum beamforming algorithm for determining the output of the microphone array 201 having N sound sensors 301. Consider an example where the target sound signal from the target sound source is at an angle θ with a normalized frequency ω. The microphone array configuration is arbitrary in a two dimensional plane, for example, a circular array configuration where the sound sensors 301 M₀, M₁, M₂, . . . , M_(N−2), M_(N−1) of the microphone array 201 are arbitrarily positioned on a circle 302. The sound signals received by each of the sound sensors 301 in the microphone array 201 are inputs to the microphone array 201. The adaptive beamforming unit 203 employs the filter-and-sum beamforming algorithm that applies independent weights to each of the inputs to the microphone array 201 such that the directivity pattern of the microphone array 201 is steered to the spatial location of the target sound signal as determined by the sound source localization unit 202.

The output “y” of the microphone array 201 having N sound sensors 301 is the filter-and-sum of the outputs of the N sound sensors 301. That is, $y = \sum_{n=0}^{N-1} w_n^T x_n$, where $x_n$ is the output of the (n+1)^(th) sound sensor 301, and $w_n^T$ denotes the transpose of a length-L filter applied to the (n+1)^(th) sound sensor 301.
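
For illustration, the filter-and-sum output can be computed directly from this definition. The following is a minimal sketch in Python (NumPy), assuming x holds one row of samples per sound sensor and w holds one length-L FIR filter per sensor; the names are illustrative and not part of the patented system.

```python
import numpy as np

def filter_and_sum(x, w):
    """Filter-and-sum beamformer output (a minimal sketch).

    x : (N, T) array of sound sensor outputs, one row per sensor.
    w : (N, L) array of length-L FIR filter taps, one row per sensor.

    Returns the length-T beamformed output y[t] = sum_n (w_n * x_n)[t].
    """
    N, T = x.shape
    y = np.zeros(T)
    for n in range(N):
        # np.convolve applies the length-L filter w[n] to sensor n's
        # signal; the 'full' output is truncated back to T samples.
        y += np.convolve(x[n], w[n], mode="full")[:T]
    return y

# Example: four sensors, 20-tap filters, one second at 8 kHz.
x = np.random.randn(4, 8000)
w = np.random.randn(4, 20) / 20.0
y = filter_and_sum(x, w)
```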

The spatial directivity pattern H(ω, θ) for the target sound signal from angle θ with normalized frequency ω is defined as:

$$H(\omega,\theta) = \frac{Y(\omega,\theta)}{\overline{X}(\omega,\theta)} = \frac{\sum_{n=0}^{N-1} W_n(\omega)\, X_n(\omega,\theta)}{\overline{X}(\omega,\theta)} \qquad (1)$$

where $\overline{X}$ is the signal received at the origin of the circular microphone array 201 and $W_n$ is the frequency response of the real-valued finite impulse response (FIR) filter $w_n$. If the target sound source is far enough away from the microphone array 201, the difference between the signal received by the (n+1)^(th) sound sensor 301 $x_n$ and the origin of the microphone array 201 is a delay $\tau_n$; that is, $X_n(\omega,\theta) = X(\omega,\theta)\, e^{-j\omega\tau_n}$.

FIG. 5 exemplarily illustrates distances between an origin of the microphone array 201 and the sound sensor 301 M₁ and the sound sensor 301 M₃ in the circular microphone array configuration, when the target sound signal is at an angle θ from the Y-axis. The microphone array system 200 disclosed herein can be used with an arbitrary directivity pattern for arbitrarily distributed sound sensors 301. For any specific microphone array configuration, the parameter that must be defined to obtain the beamformer coefficients is the value of the delay τ_(n) for each sound sensor 301. To define the value of τ_(n), an origin or a reference point of the microphone array 201 is first defined; the distance d_(n) between each sound sensor 301 and the origin is then measured; and finally the angle Φ_(n) of each sound sensor 301 biased from a vertical axis is measured.

For example, the angle between the Y-axis and the line joining the origin and the sound sensor 301 M₀ is Φ₀, the angle between the Y-axis and the line joining the origin and the sound sensor 301 M₁ is Φ₁, the angle between the Y-axis and the line joining the origin and the sound sensor 301 M₂ is Φ₂, and the angle between the Y-axis and the line joining the origin and the sound sensor 301 M₃ is Φ₃. The delays between the origin ◯ and the sound sensor 301 M₁, and between the origin ◯ and the sound sensor 301 M₃, when the incoming target sound signal from the target sound source is at an angle θ from the Y-axis, are denoted as τ₁ and τ₃, respectively.

For purposes of illustration, the detailed description refers to a circular microphone array configuration; however, the scope of the microphone array system 200 disclosed herein is not limited to the circular microphone array configuration but may be extended to include a linear array configuration, an arbitrarily distributed coplanar array configuration, or a microphone array configuration with any arbitrary geometry.

FIG. 6A exemplarily illustrates a table showing the distance between each sound sensor 301 in a circular microphone array configuration and the origin of the microphone array 201, when the target sound source is in the same plane as that of the microphone array 201. The distance measured in meters and the corresponding delay (τ) measured in number of samples are exemplarily illustrated in FIG. 6A. In an embodiment where the target sound source that emits the target sound signal is in a two dimensional plane, the delay (τ) between each of the sound sensors 301 and the origin of the microphone array 201 is determined as a function of the distance (d) between each of the sound sensors 301 and the origin, a predefined angle (Φ) between each of the sound sensors 301 and a reference axis (Y) as exemplarily illustrated in FIG. 5, and an azimuth angle (θ) between the reference axis (Y) and the target sound signal. The determined delay (τ) is represented in terms of number of samples.

If the target sound source is far enough from the microphone array 201, the time delay between the signal received by the (n+1)^(th) sound sensor 301 “x_(n)” and the origin of the microphone array 201 is herein denoted as “t”, measured in seconds. The sound signals received by the microphone array 201, which are in analog form, are converted into digital sound signals by sampling the analog sound signals at a particular frequency, for example, 8000 Hz; that is, the number of samples in each second is 8000. The delay τ can then be represented as the product of the sampling frequency (f_(s)) and the time delay (t), that is, τ=f_(s)*t. Therefore, the distance between the sound sensors 301 in the microphone array 201 corresponds to the time used for the target sound signal to travel that distance, and is measured by the number of samples within that time period.
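
As an illustration of the conversion τ=f_(s)*t, the sketch below computes a sample delay for one sensor. It relies on a common far-field assumption, not stated explicitly in the patent: the extra path length between a sensor at distance d_(n) and angle Φ_(n) from the Y-axis and the origin is taken to be d_(n)·cos(θ−Φ_(n)). The exact per-sensor expressions are those tabulated in FIGS. 6A-6B.

```python
import numpy as np

def delay_in_samples(d_n, phi_n_deg, theta_deg, fs=8000, c=343.0):
    """Sample delay tau_n between sensor n and the array origin.

    A far-field sketch: for a source at azimuth theta in the array plane,
    the extra path length to a sensor at distance d_n from the origin and
    angle phi_n from the Y-axis is assumed to be d_n*cos(theta - phi_n);
    dividing by the speed of sound c gives the time delay t, and
    tau = fs * t converts it to samples.
    """
    t = d_n * np.cos(np.radians(theta_deg - phi_n_deg)) / c
    return fs * t  # may be rounded to the nearest integer sample

# Example: sensor 0.05 m from the origin at phi = 30 degrees,
# target sound signal arriving from theta = 90 degrees.
tau = delay_in_samples(0.05, 30.0, 90.0)
```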

Consider an example where “d” is the radius of the circle 302 of the circular microphone array configuration, “f_(s)” is the sampling frequency, and “c” is the speed of sound. FIG. 6B exemplarily illustrates a table showing the relationship between the position of each sound sensor 301 in the circular microphone array configuration and its distance to the origin of the microphone array 201, when the target sound source is in the same plane as that of the microphone array 201. The distance measured in meters and the corresponding delay (τ) measured in number of samples are exemplarily illustrated in FIG. 6B.

This method of determining the delay (τ) enables beamforming for arbitrary numbers of sound sensors 301 and multiple arbitrary microphone array configurations. Once the delay (τ) is determined, the microphone array 201 can be aligned to enhance the target sound signal from a specific direction.

Therefore, the spatial directivity pattern H can be rewritten as:

$$H(\omega,\theta) = \sum_{n=0}^{N-1} W_n(\omega)\, e^{-j\omega\tau_n(\theta)} = \mathbf{w}^T \mathbf{g}(\omega,\theta) \qquad (2)$$

where $\mathbf{w}^T = \left[ w_0^T, w_1^T, w_2^T, w_3^T, \ldots, w_{N-1}^T \right]$ and $\mathbf{g}(\omega,\theta) = \left\{ g^{(i)}(\omega,\theta) \right\}_{i=1 \ldots NL} = \left\{ e^{-j\omega\left(k+\tau_n(\theta)\right)} \right\}_{i=1 \ldots NL}$ is the steering vector, with $k = \operatorname{mod}(i-1, L)$ and $n = \operatorname{floor}\!\left((i-1)/L\right)$.
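
The steering vector of equation (2) can be evaluated numerically. A minimal sketch follows, assuming the delays are given in samples and ω is a normalized frequency in radians per sample; the function names are illustrative.

```python
import numpy as np

def steering_vector(omega, tau, L):
    """Steering vector g(omega, theta) from equation (2).

    tau : length-N array of per-sensor delays tau_n(theta) in samples.
    L   : FIR filter length per sensor.
    Entry i (0-based) corresponds to tap k = i % L of sensor n = i // L,
    giving g_i = exp(-j*omega*(k + tau_n)).
    """
    N = len(tau)
    i = np.arange(N * L)
    k = i % L
    n = i // L
    return np.exp(-1j * omega * (k + tau[n]))

def directivity(w, omega, tau, L):
    """H(omega, theta) = w^T g(omega, theta); w is the stacked NL vector."""
    return w @ steering_vector(omega, tau, L)

# Example: eight sensors, 20-tap filters, delays from the array geometry.
tau = np.zeros(8)
w = np.random.randn(8 * 20)
H = directivity(w, omega=0.5, tau=tau, L=20)
```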

FIGS. 7A-7C exemplarily illustrate an embodiment of a microphone array 201 when the target sound source is in a three dimensional plane. In an embodiment where the target sound source that emits the target sound signal is in a three dimensional plane, the delay (τ) between each of the sound sensors 301 and the origin of the microphone array 201 is determined as a function of the distance (d) between each of the sound sensors 301 and the origin, a predefined angle (Φ) between each of the sound sensors 301 and a first reference axis (Y), an elevation angle (Ψ) between a second reference axis (Z) and the target sound signal, and an azimuth angle (θ) between the first reference axis (Y) and the target sound signal. The determined delay (τ) is represented in terms of number of samples. The determination of the delay enables beamforming for arbitrary numbers of the sound sensors 301 and multiple arbitrary configurations of the microphone array 201.
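
A hedged sketch of this three dimensional delay follows. It assumes the 2D far-field model scaled by sin(Ψ), which reproduces the behavior described below for FIG. 7B (all delays collapse to zero as Ψ approaches 0°); the exact per-sensor expressions are those tabulated in FIG. 7B.

```python
import numpy as np

def delay_in_samples_3d(d_n, phi_n_deg, theta_deg, psi_deg,
                        fs=8000, c=343.0):
    """Sample delay tau_n(theta, psi) for a 3D source (a hedged sketch).

    psi is the elevation angle from the Z-axis; its sin(psi) factor
    scales the in-plane delay, so all delays collapse to zero at
    psi = 0, i.e. when the source is directly above the array.
    """
    t = (d_n * np.sin(np.radians(psi_deg))
         * np.cos(np.radians(theta_deg - phi_n_deg)) / c)
    return fs * t

# Example: same sensor as before, source at elevation psi = 45 degrees.
tau = delay_in_samples_3d(0.05, 30.0, 90.0, 45.0)
```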

Consider an example of a microphone array configuration with four sound sensors 301 M₀, M₁, M₂, and M₃. FIG. 7A exemplarily illustrates a graphical representation of a microphone array 201, when the target sound source is in a three dimensional plane. As exemplarily illustrated in FIG. 7A, the target sound signal from the target sound source is received from the direction (Ψ, θ) with reference to the origin of the microphone array 201, where Ψ is the elevation angle and θ is the azimuth.

FIG. 7B exemplarily illustrates a table showing the delay between each sound sensor 301 in a circular microphone array configuration and the origin of the microphone array 201, when the target sound source is in a three dimensional plane. The target sound source in a three dimensional plane emits a target sound signal from a spatial location (Ψ, θ). The delays between the origin ◯ and the sound sensors 301 M₀, M₁, M₂, and M₃, when the incoming target sound signal from the target sound source is at an angle (Ψ, θ) from the Z-axis and the Y-axis respectively, are denoted as τ₀, τ₁, τ₂, and τ₃, respectively. When the spatial location of the target sound signal moves from the location Ψ=90° to a location Ψ=0°, sin(Ψ) changes from 1 to 0, and as a result, the difference in delay between the sound sensors 301 in the microphone array 201 becomes smaller and smaller. When Ψ=0°, there is no difference between the sound sensors 301, which implies that the target sound signal reaches each sound sensor 301 at the same time. Taking into account that the sample delay between the sound sensors 301 can only be an integer, the range within which all the sound sensors 301 are effectively identical is determined.

FIG. 7C exemplarily illustrates a three dimensional working space of the microphone array 201, where the target sound signal is incident at an elevation angle Ψ<Ω, where Ω is a specific elevation angle. When the target sound signal is incident at an elevation angle Ψ<Ω, all four sound sensors 301 M₀, M₁, M₂, and M₃ receive the same target sound signal for 0°≤θ<360°. The delay τ is a function of both the elevation angle Ψ and the azimuth angle θ; that is, τ=τ(θ, Ψ). As used herein, Ω refers to the elevation angle such that all τ_(i)(θ, Ω) are equal to each other, where i=0, 1, 2, 3, etc. The value of Ω is determined by the sample delay between each of the sound sensors 301 and the origin of the microphone array 201. The adaptive beamforming unit 203 enhances sound from this range and suppresses sound signals from other directions, for example, S₁ and S₂, treating them as ambient noise signals.

Consider a least mean square solution for beamforming according to the method disclosed herein. Let the spatial directivity pattern be 1 in the passband and 0 in the stopband. The least square cost function is defined as:

$$\begin{aligned}
J(\mathbf{w}) &= \int_{\Omega_p}\!\int_{\Theta_p} \left| H(\omega,\theta) - 1 \right|^2 d\omega\, d\theta + \alpha \int_{\Omega_s}\!\int_{\Theta_s} \left| H(\omega,\theta) \right|^2 d\omega\, d\theta \\
&= \int_{\Omega_p}\!\int_{\Theta_p} \left| H(\omega,\theta) \right|^2 d\omega\, d\theta + \alpha \int_{\Omega_s}\!\int_{\Theta_s} \left| H(\omega,\theta) \right|^2 d\omega\, d\theta \\
&\quad - 2 \int_{\Omega_p}\!\int_{\Theta_p} \operatorname{Re}\!\left( H(\omega,\theta) \right) d\omega\, d\theta + \int_{\Omega_p}\!\int_{\Theta_p} 1\, d\omega\, d\theta
\end{aligned} \qquad (3)$$

Replacing $|H(\omega,\theta)|^2 = \mathbf{w}^T \mathbf{g}(\omega,\theta)\,\mathbf{g}^H(\omega,\theta)\,\mathbf{w} = \mathbf{w}^T \left( G_R(\omega,\theta) + jG_I(\omega,\theta) \right)\mathbf{w} = \mathbf{w}^T G_R(\omega,\theta)\,\mathbf{w}$ and $\operatorname{Re}(H(\omega,\theta)) = \mathbf{w}^T \mathbf{g}_R(\omega,\theta)$, $J(\mathbf{w})$ becomes:

$$J(\mathbf{w}) = \mathbf{w}^T Q \mathbf{w} - 2\,\mathbf{w}^T \mathbf{a} + d \qquad (4)$$

where

$$Q = \int_{\Omega_p}\!\int_{\Theta_p} G_R(\omega,\theta)\, d\omega\, d\theta + \alpha \int_{\Omega_s}\!\int_{\Theta_s} G_R(\omega,\theta)\, d\omega\, d\theta$$

$$\mathbf{a} = \int_{\Omega_p}\!\int_{\Theta_p} \mathbf{g}_R(\omega,\theta)\, d\omega\, d\theta, \qquad d = \int_{\Omega_p}\!\int_{\Theta_p} 1\, d\omega\, d\theta$$

with entries $\mathbf{g}_R(\omega,\theta) = \cos\!\left[\omega(k+\tau_n)\right]$ and $G_R(\omega,\theta) = \cos\!\left[\omega(k - l + \tau_n - \tau_m)\right]$.

When ∂J/∂w=0, the cost function J is minimized. The least-squares estimate of w is obtained as:

$$\mathbf{w} = Q^{-1}\mathbf{a} \qquad (5)$$

Applying the linear constraints Cw=b, the spatial response is further constrained to a predefined value b at angle θ_(f) using the following equation:

$$\begin{bmatrix} \mathbf{g}_R^T(\omega_{start}, \theta_f) \\ \vdots \\ \mathbf{g}_R^T(\omega_{end}, \theta_f) \end{bmatrix} \mathbf{w} = \begin{bmatrix} b_{start} \\ \vdots \\ b_{end} \end{bmatrix} \qquad (6)$$

Now, the design problem becomes:

$$\min_{\mathbf{w}} \; \mathbf{w}^T Q \mathbf{w} - 2\,\mathbf{w}^T \mathbf{a} + d \quad \text{subject to} \quad C\mathbf{w} = \mathbf{b} \qquad (7)$$

and the solution of the constrained minimization problem is:

$$\mathbf{w} = Q^{-1}C^T\left( CQ^{-1}C^T \right)^{-1}\left( \mathbf{b} - CQ^{-1}\mathbf{a} \right) + Q^{-1}\mathbf{a} \qquad (8)$$

where w is the filter parameter for the designed adaptive beamforming unit 203.
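
Equations (3) through (8) can be realized numerically by approximating the passband and stopband integrals with sums over a frequency-angle grid. The sketch below is one such realization, not the patent's implementation; the grid densities, the regularization term, and the example geometry are assumptions added for illustration.

```python
import numpy as np

def design_beamformer(tau_of_theta, N, L, passband, stopband,
                      alpha=1.0, constraints=None):
    """Least-squares beamformer design, a sketch of equations (3)-(8).

    tau_of_theta : function mapping azimuth theta (radians) to the
                   length-N array of sensor delays in samples.
    passband/stopband : (omegas, thetas) grids approximating the integrals.
    constraints : optional (C, b) enforcing C w = b as in equation (6).
    """
    NL = N * L
    i = np.arange(NL)
    k, n = i % L, i // L

    def g(omega, theta):
        # complex steering vector of equation (2)
        return np.exp(-1j * omega * (k + tau_of_theta(theta)[n]))

    Q = np.zeros((NL, NL))
    a = np.zeros(NL)
    for omega in passband[0]:
        for theta in passband[1]:
            gv = g(omega, theta)
            Q += np.outer(gv, gv.conj()).real   # G_R(omega, theta)
            a += gv.real                        # g_R(omega, theta)
    for omega in stopband[0]:
        for theta in stopband[1]:
            gv = g(omega, theta)
            Q += alpha * np.outer(gv, gv.conj()).real
    # small regularization (not in the patent's equations), added here
    # only for numerical stability of the solve
    Q += 1e-6 * np.eye(NL)

    w = np.linalg.solve(Q, a)                   # equation (5): w = Q^{-1} a
    if constraints is not None:                 # equation (8)
        C, b = constraints
        QiC = np.linalg.solve(Q, C.T)
        w += QiC @ np.linalg.solve(C @ QiC, b - C @ w)
    return w

# Example: four sensors on a 5 cm circle, delays from the 2D far-field
# model; normalized frequencies roughly span 300-5000 Hz at fs = 16 kHz.
fs, c, r = 16000, 343.0, 0.05
phis = np.radians([0, 90, 180, 270])
tau = lambda th: fs * r * np.cos(th - phis) / c
w = design_beamformer(
    tau, N=4, L=20,
    passband=(np.linspace(0.2, 2.0, 10), np.radians(np.arange(-5, 36, 5))),
    stopband=(np.linspace(0.2, 2.0, 10), np.radians(np.arange(45, 316, 10))))
```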

In an embodiment, the beamforming is performed by a delay-sum method. In another embodiment, the beamforming is performed by a filter-sum method.

FIG. 8 exemplarily illustrates a method for estimating a spatial location of the target sound signal from the target sound source by the sound source localization unit 202 using a steered response power-phase transform (SRP-PHAT). The SRP-PHAT combines the advantages of sound source localization methods, for example, the time difference of arrival (TDOA) method and the steered response power (SRP) method. The TDOA method performs time delay estimation of the sound signals relative to a pair of spatially separated sound sensors 301. The estimated time delay is a function of both the location of the target sound source and the position of each of the sound sensors 301 in the microphone array 201. Because the position of each of the sound sensors 301 in the microphone array 201 is predefined, once the time delay is estimated, the location of the target sound source can be determined. In the SRP method, a filter-and-sum beamforming algorithm is applied to the microphone array 201 for sound signals in the direction of each of the disparate sound sources. The location of the target sound source corresponds to the direction in which the output of the filter-and-sum beamforming has the largest response power. The TDOA based localization is suitable under low to moderate reverberation conditions. The SRP method requires shorter analysis intervals and is relatively insensitive to environmental conditions, but cannot be used under excessive multi-path. The SRP-PHAT method disclosed herein combines the advantages of the TDOA method and the SRP method, has a decreased sensitivity to noise and reverberation compared to the TDOA method, and provides more precise location estimates than existing localization methods.

For direction i (0°≤i≤360°), the delay D_(it) is calculated 801 for the t^(th) pair of the sound sensors 301 (t=1, . . . , all pairs). The correlation value corr(D_(it)) of the t^(th) pair of the sound sensors 301 corresponding to the delay D_(it) is then calculated 802. For the direction i, the total correlation value is given 803 by:

$$CORR_i = \sum_{t=1}^{\text{all pairs}} corr\!\left( D_{it} \right)$$

Therefore, the spatial location of the target sound signal is given 804 by:

$$S = \underset{0 \,\le\, i \,\le\, 360}{\arg\max}\; CORR_i$$
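
A compact sketch of this procedure is shown below, using GCC-PHAT for the pairwise correlations. The pair_delays structure, mapping each candidate direction to precomputed sensor-pair delays D_(it) derived from the array geometry, is a hypothetical input; the patent does not prescribe this exact organization.

```python
import numpy as np

def gcc_phat(x1, x2, nfft=1024):
    """GCC-PHAT cross-correlation of one sensor pair (a minimal sketch)."""
    X1, X2 = np.fft.rfft(x1, nfft), np.fft.rfft(x2, nfft)
    cross = X1 * np.conj(X2)
    cross /= np.abs(cross) + 1e-12     # PHAT weighting: keep phase only
    return np.fft.irfft(cross, nfft)   # correlation indexed by lag

def srp_phat(frames, pair_delays):
    """Estimate the source direction by steered response power.

    frames      : (N, T) array, one row per sound sensor.
    pair_delays : dict mapping direction i (degrees) to a list of
                  ((sensor_a, sensor_b), integer delay D_it) tuples,
                  precomputed from the array geometry.
    Returns the direction with the largest summed correlation,
    S = argmax_i CORR_i.
    """
    best_dir, best_corr = None, -np.inf
    for direction, pairs in pair_delays.items():
        corr_i = 0.0
        for (a, b), d in pairs:
            r = gcc_phat(frames[a], frames[b])
            corr_i += r[d % len(r)]   # corr(D_it) at the geometric delay
        if corr_i > best_corr:
            best_dir, best_corr = direction, corr_i
    return best_dir
```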

FIGS. 9A-9B exemplarily illustrate graphs showing the results of sound source localization performed using the steered response power-phase transform (SRP-PHAT). FIG. 9A exemplarily illustrates a graph showing the value of the SRP-PHAT for every 10°. The maximum value corresponds to the location of the target sound signal from the target sound source. FIG. 9B exemplarily illustrates a graph representing the estimated target sound signal from the target sound source and a ground truth.

FIG. 10 exemplarily illustrates a system for performing adaptive beamforming by the adaptive beamforming unit 203. The algorithm for fixed beamforming is disclosed with reference to equations (3) through (8) in the detailed description of FIG. 4, FIGS. 6A-6B, and FIGS. 7A-7C, and is extended herein to adaptive beamforming. Adaptive beamforming refers to a beamforming process where the directivity pattern of the microphone array 201 is adaptively steered in the direction of a target sound signal emitted by a target sound source in motion. Adaptive beamforming achieves better ambient noise suppression than fixed beamforming. This is because the target direction of arrival, which is assumed to be stable in fixed beamforming, changes with the movement of the target sound source. Moreover, the gains of the sound sensors 301, which are assumed to be uniform in fixed beamforming, exhibit significant variation. All these factors reduce speech quality. Adaptive beamforming, on the other hand, adaptively performs beam steering and null steering, and is therefore more robust against the steering error caused by the array imperfections mentioned above.

As exemplarily illustrated in FIG. 10, the adaptive beamforming unit 203 disclosed herein comprises a fixed beamformer 204, a blocking matrix 205, an adaptation control unit 208, and an adaptive filter 206. The fixed beamformer 204 adaptively steers the directivity pattern of the microphone array 201 in the direction of the spatial location of the target sound signal from the target sound source for enhancing the target sound signal, when the target sound source is in motion. The sound sensors 301 in the microphone array 201 receive the sound signals S₁, . . . , S₄, which comprise both the target sound signal from the target sound source and the ambient noise signals. The received sound signals are fed as input to the fixed beamformer 204 and the blocking matrix 205. The fixed beamformer 204 outputs a signal “b”. In an embodiment, the fixed beamformer 204 performs fixed beamforming by filtering and summing the output sound signals from the sound sensors 301. The blocking matrix 205 outputs a signal “z” which primarily comprises the ambient noise signals. The blocking matrix 205 blocks the target sound signal from the target sound source and feeds the ambient noise signals to the adaptive filter 206 to minimize the effect of the ambient noise signals on the enhanced target sound signal.

The output “z” of the blocking matrix 205 may contain some weak target sound signals due to signal leakage. If the adaptation is active when the target sound signal, for example, speech, is present, the speech is cancelled out along with the noise. Therefore, the adaptation control unit 208 determines when the adaptation should be applied. The adaptation control unit 208 comprises a target sound signal detector 208a and a step size adjusting module 208b. The target sound signal detector 208a of the adaptation control unit 208 detects the presence or absence of the target sound signal, for example, speech. The step size adjusting module 208b adjusts the step size for the adaptation process such that when the target sound signal is present, the adaptation is slow for preserving the target sound signal, and when the target sound signal is absent, the adaptation is fast for better cancellation of the ambient noise signals.

The adaptive filter 206 is a filter that adaptively updates its filter coefficients so that the adaptive filter 206 can be operated in an unknown and changing environment. The adaptive filter 206 adaptively filters the ambient noise signals in response to detecting the presence or absence of the target sound signal in the sound signals received from the disparate sound sources. The adaptive filter 206 adapts its filter coefficients with the changes in the ambient noise signals, thereby eliminating distortion in the target sound signal, when the target sound source and the ambient noise signals are in motion. In an embodiment, the adaptive filtering is performed by a set of sub-band adaptive filters using sub-band adaptive filtering as disclosed in the detailed description of FIG. 11.
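
A normalized least mean squares (NLMS) filter is one common choice for such an adaptive filter. The sketch below assumes NLMS (the patent does not mandate a specific adaptation rule) and shows how the step size adjusting module's speech-presence decision would gate the update; speech_present stands in for the output of a hypothetical target sound signal detector.

```python
import numpy as np

def nlms_cancel(b, z, L=64, mu_speech=0.01, mu_noise=0.5,
                speech_present=None):
    """Adaptive interference canceller, a hedged NLMS sketch.

    b : fixed beamformer output (target sound signal plus residual noise).
    z : blocking matrix output (ambient noise reference).
    The step size is small when the target sound signal is detected
    (preserving speech) and large otherwise (faster noise cancellation),
    mirroring the adaptation control described above.
    """
    T = len(b)
    h = np.zeros(L)
    out = np.zeros(T)
    if speech_present is None:
        speech_present = np.zeros(T, dtype=bool)
    for t in range(L, T):
        zt = z[t - L:t][::-1]                # most recent L noise samples
        e = b[t] - h @ zt                    # noise-cancelled output
        out[t] = e
        mu = mu_speech if speech_present[t] else mu_noise
        h += mu * e * zt / (zt @ zt + 1e-8)  # NLMS coefficient update
    return out
```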

FIG. 11 exemplarily illustrates a system for sub-band adaptive filtering. Sub-band adaptive filtering involves separating a full-band signal into different frequency ranges called sub-bands prior to the filtering process. Sub-band adaptive filtering using sub-band adaptive filters leads to a higher convergence speed compared to using a full-band adaptive filter. Moreover, the noise reduction unit 207 disclosed herein is developed in a sub-band framework, whereby applying sub-band adaptive filtering provides the same sub-band framework for both beamforming and noise reduction, and thus saves on computational cost.

As exemplarily illustrated in FIG. 11, the adaptive filter 206 comprises an analysis filter bank 206a, an adaptive filter matrix 206b, and a synthesis filter bank 206c. The analysis filter bank 206a splits the enhanced target sound signal (b) from the fixed beamformer 204 and the ambient noise signals (z) from the blocking matrix 205 exemplarily illustrated in FIG. 10 into multiple frequency sub-bands. That is, the analysis filter bank 206a performs an analysis step where the outputs of the fixed beamformer 204 and the blocking matrix 205 are split into frequency sub-bands. The sub-band adaptive filter 206 typically has a shorter impulse response than its full-band counterpart. The step size can be adjusted individually for each sub-band by the step size adjusting module 208b, which leads to a higher convergence speed compared to using a full-band adaptive filter.

The adaptive filter matrix 206b adaptively filters the ambient noise signals in each of the frequency sub-bands in response to detecting the presence or absence of the target sound signal in the sound signals received from the disparate sound sources. The adaptive filter matrix 206b performs an adaptation step, where the adaptive filter 206 is adapted such that the filter output only contains the target sound signal, for example, speech. The synthesis filter bank 206c synthesizes a full-band sound signal using the frequency sub-bands of the enhanced target sound signal; that is, the synthesis filter bank 206c performs a synthesis step where the sub-band sound signal is synthesized into a full-band sound signal. Since the noise reduction and the beamforming are performed in the same sub-band framework, the noise reduction by the noise reduction unit 207, as disclosed in the detailed description of FIG. 13, is performed prior to the synthesis step, thereby reducing computation.
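
The sketch below illustrates this analyze-adapt-synthesize loop, using an STFT as a stand-in for the analysis filter bank 206a and synthesis filter bank 206c and a single adaptive tap per sub-band; the patent's factorized perfect-reconstruction bank and adaptive filter matrix 206b are more elaborate, so this is an illustration of the structure rather than the implementation.

```python
import numpy as np
from scipy.signal import stft, istft

def subband_nlms(b, z, fs=8000, nfft=256, mu=0.1):
    """Sub-band adaptive filtering sketch (STFT standing in for the
    analysis/synthesis filter banks)."""
    _, _, B = stft(b, fs=fs, nperseg=nfft)   # beamformer output, per band
    _, _, Z = stft(z, fs=fs, nperseg=nfft)   # noise reference, per band
    H = np.zeros(B.shape[0], dtype=complex)  # one adaptive tap per sub-band
    E = np.zeros_like(B)
    for m in range(B.shape[1]):              # frame index
        e = B[:, m] - H * Z[:, m]            # per-band error (enhanced signal)
        E[:, m] = e
        # per-band NLMS update; the step size mu could be adjusted
        # individually per sub-band by the step size adjusting module
        H += mu * e * np.conj(Z[:, m]) / (np.abs(Z[:, m]) ** 2 + 1e-8)
    _, y = istft(E, fs=fs, nperseg=nfft)     # synthesis back to full band
    return y
```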

In an embodiment, the analysis filter bank 206a is implemented as a perfect-reconstruction filter bank, where the output of the synthesis filter bank 206c after the analysis and synthesis steps perfectly matches the input to the analysis filter bank 206a. That is, all the sub-band analysis filter banks 206a are factorized to operate on prototype filter coefficients, and a modulation matrix is used to take advantage of the fast Fourier transform (FFT). Both the analysis and synthesis steps require performing frequency shifts in each sub-band, which involves complex-valued computations with cosines and sinusoids. The method disclosed herein employs the FFT to perform the frequency shifts required in each sub-band, thereby minimizing the number of multiply-accumulate operations. The implementation of the sub-band analysis filter bank 206a as a perfect-reconstruction filter bank ensures the quality of the target sound signal by ensuring that the sub-band analysis filter banks 206a do not distort the target sound signal itself.

FIG. 12 exemplarily illustrates a graphical representation showing the performance of a perfect-reconstruction filter bank. The solid line represents the input signal to the analysis filter bank 206a, and the circles represent the output of the synthesis filter bank 206c after analysis and synthesis. As exemplarily illustrated in FIG. 12, the output of the synthesis filter bank 206c perfectly matches the input, and the filter bank is therefore referred to as a perfect-reconstruction filter bank.

FIG. 13 exemplarily illustrates a block diagram of the noise reduction unit 207 for performing noise reduction using, for example, a Wiener-filter based noise reduction algorithm. The noise reduction unit 207 performs noise reduction for further suppressing the ambient noise signals after adaptive beamforming, for example, by using a Wiener-filter based noise reduction algorithm, a spectral subtraction noise reduction algorithm, an auditory transform based noise reduction algorithm, or a model based noise reduction algorithm. In an embodiment, the noise reduction unit 207 performs noise reduction in the multiple frequency sub-bands employed by the analysis filter bank 206a of the adaptive beamforming unit 203 for sub-band adaptive beamforming.

In an embodiment, the noise reduction is performed using the Wiener-filter based noise reduction algorithm. The noise reduction unit 207 explores the short-term and long-term statistics of the target sound signal, for example, speech, and of the ambient noise signals, as well as the wide-band and narrow-band signal-to-noise ratio (SNR), to support Wiener gain filtering. The noise reduction unit 207 comprises a target sound signal statistics analyzer 207a, a noise statistics analyzer 207b, a signal-to-noise ratio (SNR) analyzer 207c, and a Wiener filter 207d. The target sound signal statistics analyzer 207a explores the short-term and long-term statistics of the target sound signal, for example, speech. Similarly, the noise statistics analyzer 207b explores the short-term and long-term statistics of the ambient noise signals. The SNR analyzer 207c of the noise reduction unit 207 explores the wide-band and narrow-band signal-to-noise ratio (SNR). After the spectrum of the noisy speech passes through the Wiener filter 207d, an estimate of the clean-speech spectrum is generated. The synthesis filter bank 206c, by an inverse process of the analysis filter bank 206a, reconstructs the clean speech into a full-band signal, given the estimated spectrum of the clean speech.
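
A minimal sketch of the per-sub-band Wiener gain follows; the statistics analyzers and decision-directed details are omitted, and the gain floor is an assumption added here to limit musical noise.

```python
import numpy as np

def wiener_gain(noisy_psd, noise_psd, gain_floor=0.1):
    """Per-sub-band Wiener gain, a minimal sketch.

    The SNR is estimated from the noisy-speech and noise power spectra;
    the classic Wiener gain G = SNR/(1+SNR) is then applied per
    sub-band, with a floor to limit musical noise.
    """
    snr = np.maximum(noisy_psd / (noise_psd + 1e-12) - 1.0, 0.0)
    gain = snr / (1.0 + snr)
    return np.maximum(gain, gain_floor)

# Example: apply the gain to one frame of sub-band spectra X (complex),
# given a running noise PSD estimate updated during speech absence.
X = np.fft.rfft(np.random.randn(256))
noise_psd = np.full(len(X), 1.0)
clean_est = wiener_gain(np.abs(X) ** 2, noise_psd) * X
```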

FIG. 14 exemplarily illustrates a hardware implementation of the microphone array system 200 disclosed herein. The hardware implementation of the microphone array system 200 disclosed in the detailed description of FIG. 2 comprises the microphone array 201 having an arbitrary number of sound sensors 301 positioned in an arbitrary configuration, multiple microphone amplifiers 1401, one or more audio codecs 1402, a digital signal processor (DSP) 1403, a flash memory 1404, one or more power regulators 1405 and 1406, a battery 1407, a loudspeaker or a headphone 1408, and a communication interface 1409. The microphone array 201 comprises, for example, four or eight sound sensors 301 arranged in a linear or a circular microphone array configuration. The microphone array 201 receives the sound signals.

Consider an example where the microphone array 201 comprises four sound sensors 301 that pick up the sound signals. Four microphone amplifiers 1401 receive the output sound signals from the four sound sensors 301. The microphone amplifiers 1401, also referred to as preamplifiers, provide a gain to boost the power of the received sound signals for enhancing the sensitivity of the sound sensors 301. In an example, the gain of the preamplifiers is 20 dB.

The audio codec 1402 receives the amplified output from the microphone amplifiers 1401. The audio codec 1402 provides an adjustable gain level, for example, from about −74 dB to about 6 dB. The received sound signals are in an analog form. The audio codec 1402 converts the four channels of the sound signals in the analog form into digital sound signals; the preamplifiers may not be required for some applications. The audio codec 1402 then transmits the digital sound signals to the DSP 1403 for processing. The DSP 1403 implements the sound source localization unit 202, the adaptive beamforming unit 203, and the noise reduction unit 207.

After the processing, the DSP 1403 either stores the processed signal in a memory device for a recording application, or transmits the processed signal to the communication interface 1409. The recording application comprises, for example, storing the processed signal on the memory device for the purpose of playing back the processed signal at a later time. The communication interface 1409 transmits the processed signal, for example, to a computer, the internet, or a radio for communicating the processed signal. In an embodiment, the microphone array system 200 disclosed herein implements a two-way communication device where the signal received from the communication interface 1409 is processed by the DSP 1403 and the processed signal is then played through the loudspeaker or the headphone 1408.

The flash memory 1404 stores the code for the DSP 1403 and compressed audio signals. When the microphone array system 200 boots up, the DSP 1403 reads the code from the flash memory 1404 into an internal memory of the DSP 1403 and then starts executing the code. In an embodiment, the audio codec 1402 can be configured for encoding and decoding audio or sound signals during the start-up stage by writing to registers of the DSP 1403. For an eight-sensor microphone array 201, two four-channel audio codec 1402 chips may be used. The power regulators 1405 and 1406, for example, linear power regulators 1405 and switch power regulators 1406, provide the appropriate voltage and current supply for all the components, for example, 201, 1401, 1402, 1403, etc., mechanically supported and electrically connected on a circuit board. A universal serial bus (USB) controller is built into the DSP 1403. The battery 1407 is used for powering the microphone array system 200.

Consider an example where the microphone array system 200 disclosed herein is implemented on a mixed signal circuit board having a six-layer printed circuit board (PCB). Noisy digital signals easily contaminate the low voltage analog sound signals from the sound sensors 301. Therefore, the layout of the mixed signal circuit board is carefully partitioned to isolate the analog circuits from the digital circuits. Although both the inputs and outputs of the microphone amplifiers 1401 are in analog form, the microphone amplifiers 1401 are placed in a digital region of the mixed signal circuit board because of their high power consumption and switch amplifier nature.

The linear power regulators 1405 are deployed in an analog region of the mixed signal circuit board due to the low noise property exhibited by the linear power regulators 1405. Five power regulators, for example, the regulators 1405 and 1406, are designed into the microphone array system 200 circuits to ensure quality. The switch power regulators 1406 achieve an efficiency of about 95% of the input power and have a high output current capacity; however, their outputs are too noisy for analog circuits. The efficiency of the linear power regulators 1405 is determined by the ratio of the output voltage to the input voltage, which is lower than that of the switch power regulators 1406 in most cases. The regulator outputs utilized in the microphone array system 200 circuits are stable, quiet, and suitable for the low power analog circuits.

In an example, the microphone array system 200 is designed with a microphone array 201 having dimensions of 10 cm×2.5 cm×1.5 cm, a USB interface, and an assembled PCB supporting the microphone array 201 and a DSP 1403 having a low power consumption design devised for portable devices, a four-channel codec 1402, and a flash memory 1404. The DSP 1403 chip is powerful enough to handle the computations in the microphone array system 200 disclosed herein. The hardware configuration of this example can be used for any microphone array configuration, with suitable modifications to the software. In an embodiment, the adaptive beamforming unit 203 of the microphone array system 200 is implemented as hardware with software instructions programmed on the DSP 1403. The DSP 1403 is programmed for beamforming, noise reduction, echo cancellation, and USB interfacing according to the method disclosed herein, and is fine-tuned for optimal performance.

FIGS. 15A-15C exemplarily illustrate a conference phone 1500 comprising an eight-sensor microphone array 201. The eight-sensor microphone array 201 comprises eight sound sensors 301 arranged in a configuration as exemplarily illustrated in FIG. 15A. A top view of the conference phone 1500 comprising the eight-sensor microphone array 201 is exemplarily illustrated in FIG. 15A. A front view of the conference phone 1500 comprising the eight-sensor microphone array 201 is exemplarily illustrated in FIG. 15B. A handset 1502 that can be placed in a base holder 1501 of the conference phone 1500 having the eight-sensor microphone array 201 is exemplarily illustrated in FIG. 15C. In addition to a conference phone 1500, the microphone array system 200 disclosed herein with broadband beamforming can be configured for a mobile phone, a tablet computer, etc., for speech enhancement and noise reduction.

FIG. 16A exemplarily illustrates a layout of an eight-sensor microphone array 201 for a conference phone 1500. Consider an example of a circular microphone array 201 in which eight sound sensors 301 are mounted on the surface of the conference phone 1500 as exemplarily illustrated in FIG. 15A. The conference phone 1500 has a removable handset 1502 on top, and hence the microphone array system 200 is configured to accommodate the handset 1502 as exemplarily illustrated in FIGS. 15A-15C. In an example, the circular microphone array 201 has a diameter of about four inches. Eight sound sensors 301, for example, microphones M₀, M₁, M₂, M₃, M₄, M₅, M₆, and M₇, are distributed along a circle 302 on the conference phone 1500. Microphones M₄-M₇ are separated by 90 degrees from each other, and microphones M₀-M₃ are rotated counterclockwise by 60 degrees from microphones M₄-M₇, respectively.

FIG. 16B exemplarily illustrates a graphical representation of eight spatial regions to which the eight-sensor microphone array 201 of FIG. 16A responds. The space is divided into eight equal spatial regions centered at 15°, 60°, 105°, 150°, 195°, 240°, 285°, and 330°, respectively. The adaptive beamforming unit 203 configures the eight-sensor microphone array 201 to automatically point to one of these eight spatial regions according to the location of the target sound signal from the target sound source as estimated by the sound source localization unit 202.
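
A minimal sketch of the region selection just described is given below; the nearest-center rule and the function names are assumptions for illustration, since the specification only states that the array is pointed to one of the eight regions based on the estimated location.

REGION_CENTERS = [15, 60, 105, 150, 195, 240, 285, 330]  # degrees, from the text

def select_region(estimated_azimuth):
    """Map an azimuth estimate from the sound source localization unit
    to the nearest of the eight spatial region centers."""
    def angular_distance(a, b):
        d = abs(a - b) % 360
        return min(d, 360 - d)
    return min(REGION_CENTERS, key=lambda c: angular_distance(c, estimated_azimuth))

print(select_region(50))    # -> 60
print(select_region(352))   # -> 330 (distance wraps around the circle)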

FIGS. 16C-16D exemplarily illustrate computer simulations showing the steering of the directivity patterns of the eight-sensor microphone array 201 of FIG. 16A in the directions 15° and 60°, respectively, in the frequency range 300 Hz to 5 kHz. FIG. 16C exemplarily illustrates the computer simulation result showing the directivity pattern of the microphone array 201 when the target sound signal is received from the target sound source in the spatial region centered at 15°.

The computer simulation for verifying the performance of the adaptive beamforming unit 203 when the target sound signal is received from the target sound source in the spatial region centered at 15° uses the following parameters:

Sampling frequency f_s = 16 kHz,

FIR filter tap length L = 20,

Passband (Θ_p, Ω_p) = {300-5000 Hz, −5° to 35°}; the designed spatial directivity pattern is 1.

Stopband (Θ_s, Ω_s) = {300-5000 Hz, −180° to −15° and 45° to 180°}; the designed spatial directivity pattern is 0.

It can be seen that the directivity pattern of the microphone array 201 in the spatial region centered at 15° is enhanced while the sound signals from all other spatial regions are suppressed.
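
The passband/stopband specification above fully determines a broadband FIR beamformer design problem. The Python sketch below solves it with an ordinary least-squares fit; the circular geometry, the sensor ordering, the sampling grid, and the linear-phase passband target are all illustrative assumptions, and the patent's own design procedure may differ in detail.

import numpy as np

C = 343.0        # speed of sound in m/s (assumed)
FS = 16000       # sampling frequency, from the parameters above
L = 20           # FIR taps per sensor, from the parameters above
R = 0.0508       # assumed 2-inch radius of the four-inch-diameter circle, in meters

# Assumed sensor azimuths for M0..M7 per the layout of FIG. 16A.
SENSOR_ANGLES = np.radians([60, 150, 240, 330, 0, 90, 180, 270])

def steering(f, theta_deg):
    """Stacked response of all N*L FIR coefficients to a plane wave of
    frequency f arriving from azimuth theta_deg (far-field model)."""
    tau = (R / C) * np.cos(np.radians(theta_deg) - SENSOR_ANGLES)   # per-sensor delay
    taps = np.arange(L) / FS                                        # per-tap delay
    return np.exp(-2j * np.pi * f * (tau[:, None] + taps[None, :])).ravel()

t0 = (L / 2) / FS   # linear-phase target delay keeps the filters realizable
rows, target = [], []
for f in np.linspace(300, 5000, 40):
    for th in np.arange(-5, 36, 5):                                   # passband: gain 1
        rows.append(steering(f, th)); target.append(np.exp(-2j * np.pi * f * t0))
    for th in np.r_[np.arange(-180, -14, 5), np.arange(45, 181, 5)]:  # stopband: gain 0
        rows.append(steering(f, th)); target.append(0.0)

A, d = np.array(rows), np.array(target)
# The FIR coefficients are real, so fit real and imaginary parts jointly.
w, *_ = np.linalg.lstsq(np.vstack([A.real, A.imag]),
                        np.concatenate([d.real, d.imag]), rcond=None)
coeffs = w.reshape(8, L)   # one row of 20 taps per sound sensor
print("FIR coefficient matrix:", coeffs.shape)

Steering to a different spatial region only changes the passband and stopband angle ranges; the same solve then yields the coefficients for each of the eight regions.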

FIG. 16D exemplarily illustrates the computer simulation result showing the directivity pattern of the microphone array 201 when the target sound signal is received from the target sound source in the spatial region centered at 60°. The computer simulation for verifying the performance of the adaptive beamforming unit 203 when the target sound signal is received from the target sound source in the spatial region centered at 60° uses the following parameters:

Sampling frequency f_s = 16 kHz,

FIR filter tap length L = 20,

Passband (Θ_p, Ω_p) = {300-5000 Hz, 40° to 80°}; the designed spatial directivity pattern is 1.

Stopband (Θ_s, Ω_s) = {300-5000 Hz, −180° to 30° and 90° to 180°}; the designed spatial directivity pattern is 0.

It can be seen that the directivity pattern of the microphone array 201 in the spatial region centered at 60° is enhanced while the sound signals from all other spatial regions are suppressed. The other six spatial regions have similar parameters. Moreover, the main lobe has the same level at all frequencies, which means the target sound signal undergoes little distortion across frequency.

FIGS. 16E-16L exemplarily illustrate graphical representations showing the directivity patterns of the eight-sensor microphone array 201 of FIG. 16A in each of the eight spatial regions, where each directivity pattern is an average response from 300 Hz to 5000 Hz. The main lobe is about 10 dB higher than the side lobes, and therefore the ambient noise signals from other directions are highly suppressed compared to the target sound signal in the pass direction. The microphone array system 200 calculates the filter coefficients for the target sound signal, for example, speech signals, from each sound sensor 301 and combines the filtered signals to enhance the speech from any specific direction. Since speech covers a large range of frequencies, the method and system 200 disclosed herein cover broadband signals from 300 Hz to 5000 Hz.
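
The filtering-and-combining step described above is the classic filter-and-sum structure, y[k] = Σ_n Σ_l w[n, l]·x_n[k − l]. A minimal sketch follows; the function and variable names are hypothetical, and coeffs is assumed to be an N×L tap matrix such as the one produced by the design sketch above.

import numpy as np

def filter_and_sum(x, coeffs):
    """Filter each sensor signal with its own FIR taps and sum the results.
    x: (N, K) array, one row per sound sensor; coeffs: (N, L) tap matrix."""
    y = np.zeros(x.shape[1])
    for taps, channel in zip(coeffs, x):
        # Trimming the 'full' convolution to K samples keeps alignment simple.
        y += np.convolve(channel, taps)[: x.shape[1]]
    return y

# Hypothetical usage with an 8-sensor recording at 16 kHz:
# enhanced = filter_and_sum(mic_signals, coeffs)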

FIGS. 16E, 16F, 16G, 16H, 16I, 16J, 16K, and 16L exemplarily illustrate graphical representations showing the directivity pattern of the eight-sensor microphone array 201 when the target sound signal is received from the target sound source in the spatial regions centered at 15°, 60°, 105°, 150°, 195°, 240°, 285°, and 330°, respectively. The microphone array system 200 disclosed herein enhances the target sound signal from each of these directions while suppressing the ambient noise signals from the other directions.

The microphone array system 200 disclosed herein can also be implemented for a square microphone array configuration or a rectangular microphone array configuration, where a sound sensor 301 is positioned in each corner of the four-cornered array. The microphone array system 200 disclosed herein implements beamforming for sound sources ranging from a two dimensional plane to three dimensional space.

FIG. 17A exemplarily illustrates a graphical representation of four spatial regions to which a four-sensor microphone array 201 for a wireless handheld device responds. The wireless handheld device is, for example, a mobile phone. Consider an example where the microphone array 201 comprises four sound sensors 301, for example, microphones, uniformly distributed around a circle 302 having a diameter of about two inches. This configuration is identical to positioning the four sound sensors 301 or microphones on the four corners of a square. The space is divided into four equal spatial regions centered at −90°, 0°, 90°, and 180°, respectively. The adaptive beamforming unit 203 configures the four-sensor microphone array 201 to automatically point to one of these spatial regions according to the location of the target sound signal from the target sound source as estimated by the sound source localization unit 202.

FIGS. 17B-17I exemplarily illustrate computer simulations showing the directivity patterns of the four-sensor microphone array 201 of FIG. 17A with respect to azimuth and frequency. The results of the computer simulations performed for verifying the performance of the adaptive beamforming unit 203 of the microphone array system 200 disclosed herein, for a sampling frequency f_s = 16 kHz and an FIR filter tap length L = 20, are as follows:

For the spatial region centered at 0°:

Passband (Θ_p, Ω_p) = {300-4000 Hz, −20° to 20°}; the designed spatial directivity pattern is 1.

Stopband (Θ_s, Ω_s) = {300-4000 Hz, −180° to −30° and 30° to 180°}; the designed spatial directivity pattern is 0.

For the spatial region centered at 90°:

Passband (Θ_p, Ω_p) = {300-4000 Hz, 70° to 110°}; the designed spatial directivity pattern is 1.

Stopband (Θ_s, Ω_s) = {300-4000 Hz, −180° to 60° and 120° to 180°}; the designed spatial directivity pattern is 0. The directivity patterns for the spatial regions centered at −90° and 180° are similarly obtained.

FIG. 17B exemplarily illustrates the computer simulation result representing a three dimensional (3D) display of the directivity pattern of the four-sensor microphone array 201 when the target sound signal is received from the target sound source in the spatial region centered at −90°. FIG. 17C exemplarily illustrates the computer simulation result representing a two dimensional (2D) display of the directivity pattern of the four-sensor microphone array 201 when the target sound signal is received from the target sound source in the spatial region centered at −90°.

FIG. 17D exemplarily illustrates the computer simulation result representing a 3D display of the directivity pattern of the four-sensor microphone array 201 when the target sound signal is received from the target sound source in the spatial region centered at 0°. FIG. 17E exemplarily illustrates the computer simulation result representing a 2D display of the directivity pattern of the four-sensor microphone array 201 when the target sound signal is received from the target sound source in the spatial region centered at 0°.

FIG. 17F exemplarily illustrates the computer simulation result representing a 3D display of the directivity pattern of the four-sensor microphone array 201 when the target sound signal is received from the target sound source in the spatial region centered at 90°. FIG. 17G exemplarily illustrates the computer simulation result representing a 2D display of the directivity pattern of the four-sensor microphone array 201 when the target sound signal is received from the target sound source in the spatial region centered at 90°.

FIG. 17H exemplarily illustrates the computer simulation result representing a 3D display of the directivity pattern of the four-sensor microphone array 201 when the target sound signal is received from the target sound source in the spatial region centered at 180°. FIG. 17I exemplarily illustrates the computer simulation result representing a 2D display of the directivity pattern of the four-sensor microphone array 201 when the target sound signal is received from the target sound source in the spatial region centered at 180°. The 3D displays of the directivity patterns in FIG. 17B, FIG. 17D, FIG. 17F, and FIG. 17H demonstrate that the passbands have the same height. The 2D displays of the directivity patterns in FIG. 17C, FIG. 17E, FIG. 17G, and FIG. 17I demonstrate that the passbands have the same width along the frequency axis, which demonstrates the broadband properties of the microphone array 201.

FIGS. 18A-18B exemplarily illustrate a microphone array configuration for a tablet computer. In this example, four sound sensors 301 of the microphone array 201 are positioned on a frame 1801 of the tablet computer, for example, the iPad® of Apple Inc. Geometrically, the sound sensors 301 are distributed on the circle 302 as exemplarily illustrated in FIG. 18B. The radius of the circle 302 is equal to the width of the tablet computer. The angle θ between the sound sensors 301, M₂ and M₃, is determined to avoid spatial aliasing up to 4000 Hz. This microphone array configuration enhances a front speaker's voice and suppresses background ambient noise. The adaptive beamforming unit 203 configures the microphone array 201 to form an acoustic beam 1802 pointing frontwards using the method and system 200 disclosed herein. The target sound signal, that is, the front speaker's voice within the range of Φ<30°, is enhanced compared to the sound signals from other directions.
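
The constraint on θ can be estimated with the common half-wavelength rule, under which the spacing between adjacent sensors must not exceed c/(2·f_max); this rule and the 0.12 m radius below are illustrative assumptions, since the specification does not state the formula or the tablet's width.

import math

C = 343.0        # speed of sound in m/s (assumed)
F_MAX = 4000.0   # highest frequency to protect against spatial aliasing (from the text)
R = 0.12         # assumed circle radius in meters (equal to the tablet width)

d_max = C / (2 * F_MAX)   # maximum inter-sensor spacing, about 4.3 cm
# The chord between two sensors separated by angle theta on a circle of
# radius R is d = 2*R*sin(theta/2), so the largest admissible theta is:
theta_max = math.degrees(2 * math.asin(d_max / (2 * R)))
print(f"d_max = {d_max * 100:.1f} cm -> theta between M2 and M3 <= {theta_max:.1f} deg")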

FIG. 18C exemplarily illustrates an acoustic beam 1802 formed using the microphone array configuration of FIGS. 18A-18B according to the method and system 200 disclosed herein.

FIGS. 18D-18G exemplarily illustrate graphs showing the processing results of the adaptive beamforming unit 203 and the noise reduction unit 207 for the microphone array configuration of FIG. 18B, in both the time domain and the spectral domain, for the tablet computer. Consider an example where a speaker is talking in front of the tablet computer with ambient noise signals on the side. FIG. 18D exemplarily illustrates a graph showing the performance of the microphone array 201 before performing beamforming and noise reduction, with a signal-to-noise ratio (SNR) of 15 dB. FIG. 18E exemplarily illustrates a graph showing the performance of the microphone array 201 after performing beamforming and noise reduction, according to the method disclosed herein, with an SNR of 15 dB. FIG. 18F exemplarily illustrates a graph showing the performance of the microphone array 201 before performing beamforming and noise reduction, with an SNR of 0 dB. FIG. 18G exemplarily illustrates a graph showing the performance of the microphone array 201 after performing beamforming and noise reduction, according to the method disclosed herein, with an SNR of 0 dB.

It can be seen from FIGS. 18D-18G that the performance graph is noisier for the microphone array 201 before the beamforming and noise reduction are performed. Therefore, the adaptive beamforming unit 203 and the noise reduction unit 207 of the microphone array system 200 disclosed herein suppress the ambient noise signals while maintaining the clarity of the target sound signal, for example, the speech signal.
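
The before/after comparison in FIGS. 18D-18G can be quantified with a standard SNR measurement. The helper below is a hypothetical sketch; it assumes separate speech-only and noise-only recordings are available, which is typical in such evaluations but not stated in the specification.

import numpy as np

def snr_db(speech, noise):
    """Signal-to-noise ratio in dB from separate speech and noise recordings."""
    return 10 * np.log10(np.mean(speech ** 2) / np.mean(noise ** 2))

# Hypothetical usage: measure the improvement achieved by the processing chain.
# improvement = snr_db(speech_out, noise_out) - snr_db(speech_in, noise_in)
# print(f"SNR improvement: {improvement:.1f} dB")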

FIGS. 19A-19F exemplarily illustrate tables showing different microphone array configurations and the corresponding values of the delay τ_n for the sound sensors 301 in each of the microphone array configurations. The broadband beamforming method disclosed herein can be used for microphone arrays 201 with arbitrary numbers of sound sensors 301 and arbitrary locations of the sound sensors 301. The sound sensors 301 can be mounted on surfaces or edges of any speech acquisition device. For any specific microphone array configuration, the only parameter that needs to be defined to obtain the beamformer coefficients is the value of τ_n for each sound sensor 301, as disclosed in the detailed description of FIG. 5, FIGS. 6A-6B, and FIGS. 7A-7C and as exemplarily illustrated in FIGS. 19A-19F. In an example, the microphone array configuration exemplarily illustrated in FIG. 19F is implemented on a handheld device for hands-free speech acquisition. In a hands-free and non-close talking scenario, a user prefers to talk at a distance rather than speak close to the sound sensor 301, and may want to talk while watching a screen of the handheld device. The microphone array system 200 disclosed herein allows the handheld device to pick up sound signals from the direction of the speaker's mouth and suppress noise from other directions. The method and system 200 disclosed herein may be implemented on any device or equipment, for example, a voice recorder, where a target sound signal or speech needs to be enhanced.
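
For the two dimensional case, the delay entries tabulated in FIGS. 19A-19F follow from the relation τ_n = f_s·t_n given in claim 19. The sketch below uses the standard far-field form t_n = (r_n/c)·cos(φ − ψ_n); the cosine form and the sign convention are conventional assumptions consistent with the quantities named in the description (distance, predefined sensor angle, and azimuth), not a formula quoted from it.

import math

def delay_in_samples(r_n, psi_n, phi, fs=16000, c=343.0):
    """tau_n = fs * t_n, the delay of sound sensor n relative to the origin.
    r_n: distance from sensor n to the origin, in meters;
    psi_n: predefined angle of sensor n from the reference axis, in degrees;
    phi: azimuth angle of the target sound signal, in degrees."""
    t_n = (r_n / c) * math.cos(math.radians(phi - psi_n))
    return fs * t_n

# Example: a sensor 5 cm from the origin at 90 deg, target azimuth 15 deg.
print(f"tau = {delay_in_samples(0.05, 90, 15):+.2f} samples")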

The foregoing examples have been provided merely for the purpose of explanation and are in no way to be construed as limiting the present invention disclosed herein. While the invention has been described with reference to various embodiments, it is understood that the words which have been used herein are words of description and illustration, rather than words of limitation. Further, although the invention has been described herein with reference to particular means, materials, and embodiments, the invention is not intended to be limited to the particulars disclosed herein; rather, the invention extends to all functionally equivalent structures, methods, and uses, such as are within the scope of the appended claims. Those skilled in the art, having the benefit of the teachings of this specification, may effect numerous modifications thereto, and changes may be made without departing from the scope and spirit of the invention in its aspects.

We claim:
1. A method for enhancing a target sound signal from a plurality of sound signals, comprising: providing a microphone array system comprising an array of sound sensors positioned in a linear, circular, or other configuration, a sound source localization unit, an adaptive beamforming unit, and a noise reduction unit, wherein said sound source localization unit, said adaptive beamforming unit, and said noise reduction unit are integrated in a digital signal processor, and wherein said sound source localization unit, said adaptive beamforming unit, and said noise reduction unit are in operative communication with said array of said sound sensors; receiving said sound signals from a plurality of disparate sound sources by said sound sensors, wherein said received sound signals comprise said target sound signal from a target sound source among said disparate sound sources, and ambient noise signals; determining a delay between each of said sound sensors and an origin of said array of said sound sensors as a function of distance between each of said sound sensors and said origin, a predefined angle between each of said sound sensors and a reference axis, and an azimuth angle between said reference axis and said target sound signal, when said target sound source that emits said target sound signal is in a two dimensional plane, wherein said delay is represented in terms of number of samples, and wherein said determination of said delay enables beamforming for arbitrary numbers of said array of sound sensors and in a plurality of arbitrary configurations of said array of said sound sensors; estimating a spatial location of said target sound signal from said received sound signals by said sound source localization unit; performing adaptive beamforming for steering a directivity pattern of said array of said sound sensors in a direction of said spatial location of said target sound signal by said adaptive beamforming unit, wherein said adaptive beamforming unit enhances said target sound signal and partially suppresses said ambient noise signals; and suppressing said ambient noise signals by said noise reduction unit for further enhancing said target sound signal.
2. The method of claim 1, wherein said spatial location of said target sound signal from said target sound source is estimated using a steered response power-phase transform by said sound source localization unit.
3. The method of claim 1, wherein said adaptive beamforming comprises: providing a fixed beamformer, a blocking matrix, and an adaptive filter in said adaptive beamforming unit; steering said directivity pattern of said array of said sound sensors in said direction of said spatial location of said target sound signal from said target sound source by said fixed beamformer for enhancing said target sound signal, when said target sound source is in motion; feeding said ambient noise signals to said adaptive filter by blocking said target sound signal received from said target sound source using said blocking matrix; and adaptively filtering said ambient noise signals by said adaptive filter in response to detecting one of presence and absence of said target sound signal in said sound signals received from said disparate sound sources.
4. The method of claim 3, wherein said fixed beamformer performs fixed beamforming by filtering and summing output sound signals from said sound sensors.
5. The method of claim 3, wherein said adaptive filtering comprises sub-band adaptive filtering performed by said adaptive filter, wherein said sub-band adaptive filtering comprises: providing an analysis filter bank, an adaptive filter matrix, and a synthesis filter bank in said adaptive filter; splitting said enhanced target sound signal from said fixed beamformer and said ambient noise signals from said blocking matrix into a plurality of frequency sub-bands by said analysis filter bank; adaptively filtering said ambient noise signals in each of said frequency sub-bands by said adaptive filter matrix in response to detecting one of presence and absence of said target sound signal in said sound signals received from said disparate sound sources; and synthesizing a full-band sound signal using said frequency sub-bands of said enhanced target sound signal by said synthesis filter bank.
6. The method of claim 3, wherein said adaptive beamforming further comprises detecting said presence of said target sound signal by an adaptation control unit provided in said adaptive beamforming unit and adjusting a step size for said adaptive filtering in response to detecting one of said presence and said absence of said target sound signal in said sound signals received from said disparate sound sources.
7. The method of claim 1, wherein said noise reduction unit performs noise reduction by using one of a Wiener-filter based noise reduction algorithm, a spectral subtraction noise reduction algorithm, an auditory transform based noise reduction algorithm, and a model based noise reduction algorithm.
8. The method of claim 1, wherein said noise reduction unit performs noise reduction in a plurality of frequency sub-bands, wherein said frequency sub-bands are employed by an analysis filter bank of said adaptive beamforming unit for sub-band adaptive beamforming.
9. A system for enhancing a target sound signal from a plurality of sound signals, comprising: an array of sound sensors positioned in a linear, circular, or other configuration, wherein said sound sensors receive said sound signals from a plurality of disparate sound sources, wherein said received sound signals comprise said target sound signal from a target sound source among said disparate sound sources, and ambient noise signals; a digital signal processor, said digital signal processor comprising: a sound source localization unit that estimates a spatial location of said target sound signal from said received sound signals, by determining a delay between each of said sound sensors and an origin of said array of said sound sensors as a function of distance between each of said sound sensors and said origin, a predefined angle between each of said sound sensors and a reference axis, and an azimuth angle between said reference axis and said target sound signal, when said target sound source that emits said target sound signal is in a two dimensional plane, wherein said delay is represented in terms of number of samples, and wherein said determination of said delay enables beamforming for arbitrary numbers of said array of sound sensors and in a plurality of arbitrary configurations of said array of said sound sensors; an adaptive beamforming unit that steers a directivity pattern of said array of said sound sensors in a direction of said spatial location of said target sound signal, wherein said adaptive beamforming unit enhances said target sound signal and partially suppresses said ambient noise signals; and a noise reduction unit that suppresses said ambient noise signals for further enhancing said target sound signal.
10. The system of claim 9, wherein said sound source localization unit estimates said spatial location of said target sound signal from said target sound source using a steered response power-phase transform.
11. The system of claim 9, wherein said adaptive beamforming unit comprises: a fixed beamformer that steers said directivity pattern of said array of said sound sensors in said direction of said spatial location of said target sound signal from said target sound source for enhancing said target sound signal, when said target sound source is in motion; a blocking matrix that feeds said ambient noise signals to an adaptive filter by blocking said target sound signal received from said target sound source; and said adaptive filter that adaptively filters said ambient noise signals in response to detecting one of presence and absence of said target sound signal in said sound signals received from said disparate sound sources.
12. The system of claim 11, wherein said fixed beamformer performs fixed beamforming by filtering and summing output sound signals from said sound sensors.
13. The system of claim 11, wherein said adaptive filter comprises a set of sub-band adaptive filters comprising: an analysis filter bank that splits said enhanced target sound signal from said fixed beamformer and said ambient noise signals from said blocking matrix into a plurality of frequency sub-bands; an adaptive filter matrix that adaptively filters said ambient noise signals in each of said frequency sub-bands in response to detecting one of presence and absence of said target sound signal in said sound signals received from said disparate sound sources; and a synthesis filter bank that synthesizes a full-band sound signal using said frequency sub-bands of said enhanced target sound signal.
14. The system of claim 9, wherein said adaptive beamforming unit further comprises an adaptation control unit that detects said presence of said target sound signal and adjusts a step size for said adaptive filtering in response to detecting one of said presence and said absence of said target sound signal in said sound signals received from said disparate sound sources.
15. The system of claim 9, wherein said noise reduction unit is one of a Wiener-filter based noise reduction unit, a spectral subtraction noise reduction unit, an auditory transform based noise reduction unit, and a model based noise reduction unit.
16. The system of claim 9, further comprising one or more audio codecs that convert said sound signals in an analog form of said sound signals into digital sound signals and reconvert said digital sound signals into said analog form of said sound signals.
17. The system of claim 9, wherein said noise reduction unit performs noise reduction in a plurality of frequency sub-bands employed by an analysis filter bank of said adaptive beamforming unit for sub-band adaptive beamforming.
18. The system of claim 9, wherein said array of said sound sensors is one of a linear array of said sound sensors, a circular array of said sound sensors, and an arbitrarily distributed coplanar array of said sound sensors.
19. The method of claim 1, wherein said delay (τ) is determined by a formula τ = f_s*t, wherein f_s is a sampling frequency and t is a time delay calculated based on said number of samples within a time period and a time delay for said target sound signal to travel said distance between each of said sound sensors in said microphone array and said origin of said array of said sound sensors, and wherein said distance between said each of said sound sensors in the microphone array and said origin of said array of said sound sensors can be the same or different.
20. A method for enhancing a target sound signal from a plurality of sound signals, comprising: providing a microphone array system comprising an array of sound sensors positioned in a linear, circular, or other configuration, a sound source localization unit, an adaptive beamforming unit, and a noise reduction unit, wherein said sound source localization unit, said adaptive beamforming unit, and said noise reduction unit are integrated in a digital signal processor, and wherein said sound source localization unit, said adaptive beamforming unit, and said noise reduction unit are in operative communication with said array of said sound sensors; receiving said sound signals from a plurality of disparate sound sources by said sound sensors, wherein said received sound signals comprise said target sound signal from a target sound source among said disparate sound sources, and ambient noise signals; determining a delay between each of said sound sensors and an origin of said array of said sound sensors as a function of distance between each of said sound sensors and said origin, a predefined angle between each of said sound sensors and a first reference axis, an elevation angle between a second reference axis and said target sound signal, and an azimuth angle between said first reference axis and said target sound signal, when said target sound source that emits said target sound signal is in a three dimensional plane, wherein said delay is represented in terms of number of samples, and wherein said determination of said delay enables beamforming for arbitrary numbers of said array of sound sensors and in a plurality of arbitrary configurations of said array of said sound sensors; estimating a spatial location of said target sound signal from said received sound signals by said sound source localization unit; performing adaptive beamforming for steering a directivity pattern of said array of said sound sensors in a direction of said spatial location of said target sound signal by said adaptive beamforming unit, wherein said adaptive beamforming unit enhances said target sound signal and partially suppresses said ambient noise signals; and suppressing said ambient noise signals by said noise reduction unit for further enhancing said target sound signal.
21. A system for enhancing a target sound signal from a plurality of sound signals, comprising: an array of sound sensors positioned in a linear, circular, or other configuration, wherein said sound sensors receive said sound signals from a plurality of disparate sound sources, wherein said received sound signals comprise said target sound signal from a target sound source among said disparate sound sources, and ambient noise signals; a digital signal processor, said digital signal processor comprising: a sound source localization unit that estimates a spatial location of said target sound signal from said received sound signals by determining a delay between each of said sound sensors and an origin of said array of said sound sensors as a function of distance between each of said sound sensors and said origin, a predefined angle between each of said sound sensors and a first reference axis, an elevation angle between a second reference axis and said target sound signal, and an azimuth angle between said first reference axis and said target sound signal, when said target sound source that emits said target sound signal is in a three dimensional plane, wherein said delay is represented in terms of number of samples, and wherein said determination of said delay enables beamforming for arbitrary numbers of said array of sound sensors and in a plurality of arbitrary configurations of said array of said sound sensors; an adaptive beamforming unit that steers a directivity pattern of said array of said sound sensors in a direction of said spatial location of said target sound signal, wherein said adaptive beamforming unit enhances said target sound signal and partially suppresses said ambient noise signals; and a noise reduction unit that suppresses said ambient noise signals for further enhancing said target sound signal.
22. A method for enhancing a target sound signal from a plurality of sound signals, comprising: providing a microphone array system comprising an array of sound sensors, a sound source localization unit, a beamforming unit, and a noise reduction unit, wherein said sound source localization unit, said beamforming unit, and said noise reduction unit are integrated in a digital signal processor, and wherein said sound source localization unit, said beamforming unit, and said noise reduction unit are in operative communication with said array of said sound sensors; receiving said sound signals from a plurality of disparate sound sources by said sound sensors, wherein said received sound signals comprise said target sound signal from a target sound source among said disparate sound sources, and ambient noise signals; determining a delay between each of said sound sensors and a reference point of said array of said sound sensors as a function of distance between each of said sound sensors and said reference point, a predefined angle between each of said sound sensors and a reference axis, and an azimuth angle between said reference axis and said target sound signal, when said target sound source that emits said target sound signal is in a two dimensional plane, wherein said delay is represented in terms of number of samples, and wherein said determination of said delay enables beamforming for two or more of said sound sensors; estimating a spatial location of said target sound signal from said received sound signals by said sound source localization unit; performing beamforming for steering a directivity pattern of said array of said sound sensors in a direction of said spatial location of said target sound signal by said beamforming unit, wherein said beamforming unit enhances said target sound signal and partially suppresses said ambient noise signals; and suppressing said ambient noise signals by said noise reduction unit for further enhancing said target sound signal.
23. The method of claim 22, wherein said beamforming comprises: providing a fixed beamformer, a blocking matrix, and an adaptive filter in said beamforming unit; steering said directivity pattern of said array of said sound sensors in said direction of said spatial location of said target sound signal from said target sound source by said fixed beamformer for enhancing said target sound signal, when said target sound source is in motion; feeding said ambient noise signals to said adaptive filter by blocking said target sound signal received from said target sound source using said blocking matrix; and adaptively filtering said ambient noise signals by said adaptive filter in response to detecting one of presence and absence of said target sound signal in said sound signals received from said disparate sound sources.
24. The method of claim 23, wherein said beamforming further comprises detecting said presence of said target sound signal by an adaptation control unit provided in said beamforming unit and adjusting a step size for said adaptive filtering in response to detecting one of said presence and said absence of said target sound signal in said sound signals received from said disparate sound sources.
25. The method of claim 22, wherein said noise reduction unit performs noise reduction in a plurality of frequency sub-bands, wherein said frequency sub-bands are employed by an analysis filter bank of said beamforming unit for sub-band adaptive beamforming.
26. A system for enhancing a target sound signal from a plurality of sound signals, comprising: an array of sound sensors, wherein said sound sensors receive said sound signals from a plurality of disparate sound sources, wherein said received sound signals comprise said target sound signal from a target sound source among said disparate sound sources, and ambient noise signals; a digital signal processor, said digital signal processor comprising: a sound source localization unit that estimates a spatial location of said target sound signal from said received sound signals, by determining a delay between each of said sound sensors and a reference point of said array of said sound sensors as a function of distance between each of said sound sensors and said reference point, a predefined angle between each of said sound sensors and a reference axis, and an azimuth angle between said reference axis and said target sound signal, when said target sound source that emits said target sound signal is in a two dimensional plane, wherein said delay is represented in terms of number of samples, and wherein said determination of said delay enables beamforming for two or more of said sound sensors; a beamforming unit that steers a directivity pattern of said array of said sound sensors in a direction of said spatial location of said target sound signal, wherein said beamforming unit enhances said target sound signal and partially suppresses said ambient noise signals; and a noise reduction unit that suppresses said ambient noise signals for further enhancing said target sound signal.
27. The system of claim 26, wherein said beamforming unit further comprises an adaptation control unit that detects said presence of said target sound signal and adjusts a step size for said adaptive filtering in response to detecting one of said presence and said absence of said target sound signal in said sound signals received from said disparate sound sources.
28. The system of claim 26, wherein said noise reduction unit performs noise reduction in a plurality of frequency sub-bands employed by an analysis filter bank of said beamforming unit for sub-band adaptive beamforming.
29. The system of claim 26, wherein said array of said sound sensors is one of a linear array of said sound sensors, a circular array of said sound sensors, and other types of arrays of said sound sensors.
30. A method for enhancing a target sound signal from a plurality of sound signals, comprising: providing a microphone array system comprising an array of sound sensors, a sound source localization unit, a beamforming unit, and a noise reduction unit, wherein said sound source localization unit, said beamforming unit, and said noise reduction unit are integrated in a digital signal processor, and wherein said sound source localization unit, said beamforming unit, and said noise reduction unit are in operative communication with said array of said sound sensors; receiving said sound signals from a plurality of disparate sound sources by said sound sensors, wherein said received sound signals comprise said target sound signal from a target sound source among said disparate sound sources, and ambient noise signals; determining a delay between each of said sound sensors and a reference point of said array of said sound sensors as a function of distance between each of said sound sensors and said reference point, a predefined angle between each of said sound sensors and a first reference axis, an elevation angle between a second reference axis and said target sound signal, and an azimuth angle between said first reference axis and said target sound signal, when said target sound source that emits said target sound signal is in a three dimensional plane, wherein said delay is represented in terms of number of samples, and wherein said determination of said delay enables beamforming for two or more of said sound sensors; estimating a spatial location of said target sound signal from said received sound signals by said sound source localization unit; performing beamforming for steering a directivity pattern of said array of said sound sensors in a direction of said spatial location of said target sound signal by said beamforming unit, wherein said beamforming unit enhances said target sound signal and partially suppresses said ambient noise signals; and suppressing said ambient noise signals by said noise reduction unit for further enhancing said target sound signal.
31. A system for enhancing a target sound signal from a plurality of sound signals, comprising: an array of sound sensors, wherein said sound sensors receive said sound signals from a plurality of disparate sound sources, wherein said received sound signals comprise said target sound signal from a target sound source among said disparate sound sources, and ambient noise signals; a digital signal processor, said digital signal processor comprising: a sound source localization unit that estimates a spatial location of said target sound signal from said received sound signals by determining a delay between each of said sound sensors and a reference point of said array of said sound sensors as a function of distance between each of said sound sensors and said reference point, a predefined angle between each of said sound sensors and a first reference axis, an elevation angle between a second reference axis and said target sound signal, and an azimuth angle between said first reference axis and said target sound signal, when said target sound source that emits said target sound signal is in a three dimensional plane, wherein said delay is represented in terms of number of samples, and wherein said determination of said delay enables beamforming for two or more of said sound sensors; a beamforming unit that steers a directivity pattern of said array of said sound sensors in a direction of said spatial location of said target sound signal, wherein said beamforming unit enhances said target sound signal and partially suppresses said ambient noise signals; and a noise reduction unit that suppresses said ambient noise signals for further enhancing said target sound signal.
32. A system for enhancing a target sound signal from a plurality of sound signals, comprising: an array of sound sensors, wherein said sound sensors receive said sound signals from a plurality of disparate sound sources, wherein said received sound signals comprise said target sound signal from a target sound source among said disparate sound sources, and ambient noise signals; a digital signal processor, said digital signal processor comprising: a sound source localization unit that estimates a spatial location of said target sound signal from said received sound signals by determining a delay between each of said sound sensors and a reference point of said array of said sound sensors as a function of distance between each of said sound sensors and said reference point and an angle of each of said sound sensors biased from a reference axis; a beamforming unit that enhances said target sound signal and suppresses said ambient noise signals; and a noise reduction unit that suppresses said ambient noise signals.
33. A system for enhancing a target sound signal from a plurality of sound signals, comprising: an array of sound sensors, wherein said sound sensors receive said sound signals from a plurality of disparate sound sources, wherein said received sound signals comprise said target sound signal from a target sound source among said disparate sound sources, and ambient noise signals; a digital signal processor, said digital signal processor comprising: a sound source localization unit that estimates a spatial location of said target sound signal from said received sound signals by determining a delay between each of said sound sensors and a reference point of said array of said sound sensors as a function of distance between each of said sound sensors and said reference point, a predefined angle between each of said sound sensors and a reference axis, and an azimuth angle between said reference axis and said target sound signal; a beamforming unit that enhances said target sound signal and suppresses said ambient noise signals; and a noise reduction unit that suppresses said ambient noise signals.
34. A system for enhancing a target sound signal from a plurality of sound signals, comprising: an array of sound sensors, wherein said sound sensors receive said sound signals from a plurality of disparate sound sources, wherein said received sound signals comprise said target sound signal from a target sound source among said disparate sound sources, and ambient noise signals; a digital signal processor, said digital signal processor comprising: a sound source localization unit that estimates a spatial location of said target sound signal from said received sound signals by determining a delay between each of said sound sensors and a reference point of said array of said sound sensors as a function of distance between each of said sound sensors and said reference point, a predefined angle between each of said sound sensors and a first reference axis, an elevation angle between a second reference axis and said target sound signal, and an azimuth angle between said first reference axis and said target sound signal; a beamforming unit that enhances said target sound signal and suppresses said ambient noise signals; and a noise reduction unit that suppresses said ambient noise signals.
35. A system for enhancing a target sound signal from a plurality of sound signals, comprising: an array of sound sensors positioned in a non-circular configuration, wherein said sound sensors receive said sound signals from a plurality of disparate sound sources, wherein said received sound signals comprise said target sound signal from a target sound source among said disparate sound sources, and ambient noise signals; a digital signal processor, said digital signal processor comprising: a sound source localization unit that estimates a spatial location of said target sound signal from said received sound signals by determining a delay between each of said sound sensors and a reference point of said array of said sound sensors as a function of distance between each of said sound sensors and said reference point and an angle of each of said sound sensors biased from a reference axis, wherein said distance between each of said sound sensors and said reference point varies from a minimum value to a maximum value, and wherein said minimum value corresponds to zero and said maximum value is defined based on a limitation associated with size of said system; a beamforming unit that enhances said target sound signal and suppresses said ambient noise signals; and a noise reduction unit that suppresses said ambient noise signals.